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Abstract 

Background: Specific genetic contributions for preeclampsia (PE) are currently unknown. This genome-wide 
association study (GWAS) aims to identify maternal single nucleotide polymorphisms (SNPs) and copy-number 
variants (CNVs) involved in the etiology of PE. 

Methods: A genome-wide scan was performed on 177 PE cases (diagnosed according to National Heart, Lung and 
Blood Institute guidelines) and 116 normotensive controls. White female study subjects from Iowa were genotyped 
on Affymetrix SNP 6.0 microarrays. CNV calls made using a combination of four detection algorithms (Birdseye, 
Canary, PennCNV, and QuantiSNP) were merged using CNVision and screened with stringent prioritization criteria. 
Due to limited DNA quantities and the deleterious nature of copy-number deletions, it was decided a priori that 
only deletions would be selected for assay on the entire case-control dataset using quantitative real-time PCR. 

Results: The top four SNP candidates had an allelic or genotypic p-value between 10" 5 and 10" 6 , however, none 
surpassed the Bonferroni-corrected significance threshold. Three recurrent rare deletions meeting prioritization criteria 
detected in multiple cases were selected for targeted genotyping. A locus of particular interest was found showing an 
enrichment of case deletions in 19q 13.31 (5/169 cases and 1/1 14 controls), which encompasses the PSG 1 1 gene 
contiguous to a highly plastic genomic region. All algorithm calls for these regions were assay confirmed. 

Conclusions: CNVs may confer risk for PE and represent interesting regions that warrant further investigation. Top SNP 
candidates identified from the GWAS, although not genome-wide significant, may be useful to inform future studies in 
PE genetics. 

Keywords: Copy-number variant, Genome-wide association study, Microarray analysis, Preeclampsia, Single nucleotide 
polymorphism 



Background 

Preeclampsia (PE) is a pregnancy-specific complication 
which affects 2-7% of pregnancies [1]. Recognized as a 
leading cause of maternal and fetal morbidity and mor- 
tality worldwide, PE is characterized by new onset hyper- 
tension and proteinuria with or without other multi- 
system disorders. Family-based studies in several geo- 
graphically and ethnically diverse populations have 
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demonstrated the familial nature of PE [2-5] . Despite the 
evidence for a genetic basis of PE, its exact etiology 
remains undefined. 

Numerous candidate genes have been implicated in the 
pathogenesis of PE; particularly genes involved in immune 
maladaptation, placental ischemia, and increased oxidative 
stress [6]. Recent research found that the complement sys- 
tem may play a role in PE [7]. While sequence variants in 
over 70 genes have been investigated, the majority of stud- 
ies have focused on only a handful of these genes [8,9]. 
Hundreds of candidate gene studies have been conducted, 
but findings have been inconsistent [10]. Selection of can- 
didate genes for investigation is limited by an incomplete 



O© 2012 Zhao et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative 
BiolVlGCl C6ntTcll Commons Attribution License (http://creativecommons.Org/licenses/by/2.0), which permits unrestricted use, distribution, and 
reproduction in any medium, provided the original work is properly cited. 



Zhao et al. BMC Pregnancy and Childbirth 2012, 12:61 
http://www.biomedcentral.com/1471-2393/12/61 



Page 2 of 10 



understanding of biological processes involved in the 
pathogenesis of PE [8]. Furthermore, most existing studies 
examining the genetic basis of PE have focused on single 
nucleotide polymorphisms (SNPs), which explain only a 
small proportion of the overall heritability of complex 
disorders. 

Copy-number variants (CNVs), another type of genetic 
variation, may contribute to the missing heritability of 
complex disorders such as PE [11]. An estimated 13% of 
the human genome is believed to be copy- number vari- 
able [12]. These variants range from less than one kilo- 
base (kb) to several megabases (Mb) in size and include 
deletions and amplifications [13]. Functionally relevant 
CNVs can alter gene function or regulation by various 
proposed mechanisms, including alteration of gene dos- 
age, gene interruption, gene fusion, and alteration of a 
genes position relative to regulatory elements. CNVs 
may induce phenotypic changes. The phenotypic conse- 
quences of these alterations depend on the nature and 
extent of the deleted or duplicated DNA sequence 
[14,15]. The disruption of genes by CNVs has been 
linked to numerous complex disorders, including neuro- 
psychiatric and autoimmune disorders [16]. However, to 
date there have been no reports examining an associ- 
ation between CNVs and PE. 

Given the current limited state of knowledge on the 
genetics of PE, a genome-wide association study (GWAS) 
was conducted to identify potential PE-associated SNPs 
and CNVs using a case-control study design. This is the 
first reported GWAS on PE. 

Materials and methods 

Ethics statement 

This study was approved by the University of Iowa Insti- 
tutional Review Board and the Yale University Human 
Investigation Committee. 

Study population 

Female subjects were recruited through the SOPHIA 
study— a case-control study designed to examine the roles 
of maternal-fetal human leukocyte antigen and sexual his- 
tory in PE [17]. A total of 3078 primaparous mothers who 
gave birth in Iowa from August 2002 to May 2005 were 
identified from electronic birth certificates provided by 
the Iowa Department of Public Health as potential sub- 
jects. Potential PE cases were selected from primiparous 
women who were "check-box positive" on their infants 
birth certificate for pregnancy-induced hypertension or 
eclampsia. Potential controls were a random sample of 
primiparas who had no indication of hypertension on 
their infants birth certificate. Willing subjects were 
screened for initial eligibility and excluded based on any 
of the following criteria: age <18 years at delivery; non- 
English-speaking; history of an autoimmune disease 



(e.g. systemic lupus erythematosus, insulin-dependent dia- 
betes mellitus, rheumatoid arthritis); recurrent spontaneous 
abortion (>3 sequential pregnancy losses); chronic hyper- 
tension; plural gestations; major congenital anomalies; in- 
fant death; or seriously ill infant. The final case status of all 
eligible and consenting subjects was determined using clin- 
ical information collected through extensive telephone 
interview and chart review. Buccal samples were self- 
collected and mailed by the study subjects using methods 
of collection, storage, and packaging that would maximize 
DNA yields from cytobrushes [17]. 

Figure 1 summarizes the subject selection process. 
Based on information from interview and chart review, 
274 PE cases and 190 normotensive controls were ascer- 
tained. The final number of cases and controls, who 
additionally consented to future genetic studies, was 225 
and 150, respectively. Due to concerns about population 
stratification and the small numbers of subjects of other 
races/ethnicities, only white females were included in 
the present study (n = 196 cases and 137 controls). 
Among these subjects, a total of 177 cases (mean age: 

27.53 ±5.01 years) and 116 controls (mean age: 

27.54 ±5.17 years) had sufficient DNA for genome-wide 
SNP genotyping. 

Phenotype definition 

PE was defined according to National Heart, Lung and 
Blood Institute (NHLBI) guidelines as having de novo 
hypertension (systolic blood pressure >140 mmHg or dia- 
stolic blood pressure >90 mmHg on two or more occa- 
sions at least six hours apart after the 20th week of 
gestation) and accompanying proteinuria (urine protein 
concentration >300 mg/L, equivalent to dipstick protein 
test value of 1+ from two or more specimens collected at 
least four hours apart; one or more urinary dipstick values 
of 2+ near the end of pregnancy; one or more catheterized 
dipstick value of 1+ during delivery hospitalization; or a 
24- hour urine collection with protein >300 mg). Potential 
cases were excluded if pre-existing hypertension could not 
be ruled out, only partial criteria for PE were met, or a de- 
finitive diagnosis could not be made due to incomplete in- 
formation. Potential controls were excluded if their 
medical records included any indication of high blood 
pressure (systolic blood pressure >140 mmHg or diastolic 
blood pressure >90 mmHg) in the prenatal or postpartum 
period, two or more high blood pressure readings in the 
intrapartum period, or any indication of proteinuria dur- 
ing pregnancy (1+ on dipstick protein testing on two or 
more occasions). 

GWAS genotyping 

Buccal cell DNA was extracted from cytobrush samples 
using Puregene DNA Tissue Kits (Gentra Systems, Min- 
neapolis, MN) following the manufacturers protocol, 
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Iowa births to primaparous women from August 2002 to May 2005 (n = 3078) 



Assessed for eligibility (n = 1850) 



Unable to contact (n = 1228) 



Screen eligible (n = 1747) 



Excluded (n = 103) 

• Not meeting eligibility criteria (n = 41) 

• Other reasons (n = 62) 



Interviewed (n= 1188) 



Excluded (n = 556) 

• Declined interview (n = 493) 

• Incomplete interview (n = 66) 



Consented (n = 982) 



Declined consent (n = 206) 



Chart review (n = 951) 



Incomplete/unavailable chart review (n = 31) 



Definite PE (n = 274) 



Excluded (n = 487) 

• PIH/other (n = 480) 

• Probable PE (n = 7) 



Normotensive controls (n = 190) 



Consented to future genetic studies 
(n = 225) 



Consented to future genetic studies 
(n= 150) 



White ethnicity (n = 196) 



White ethnicity (n = 137) 



Genotyped (n = 177) 



Genotyped (n = 116) 



Figure 1 Flow chart of the SOPHIA study subject recruitment process. 



with minor modifications [17]. After extraction, DNA 
samples were assessed for quality by running on a 1% 
agarose gel. Genotyping was performed at the Rockefel- 
ler University Genomics Resource Center using Affyme- 
trix Genome- Wide Human SNP Array 6.0 (Affymetrix, 
Santa Clara, CA) according to the manufacturers 
recommended protocol. 

Sample quality was assessed using Dynamic Model al- 
gorithm and genotyping calls were generated using Bird- 
seed algorithm in Genotyping Console 4.0 (Affymetrix). 
Samples with a quality control (QC) call rate (based on a 
subset of 3022 SNP markers) less than the default 
threshold of 86% were excluded (n = 1). This recom- 
mended QC call rate threshold is well correlated with 



Birdseed call rate and concordance (>99.5%) based on 
HapMap data (Affymetrix, 2012; personal communica- 
tion). The mean QC call rate across the remaining sam- 
ples (n = 292) was 94.4%. 

SNP analysis 

Mitochondrial SNPs, SNPs that were monomorphic or 
contained only heterozygotes, SNPs that significantly 
deviated from Hardy- Weinberg equilibrium, or SNPs with 
call rates less than 95% among cases or controls were 
deemed to have failed QC and excluded (see Additional 
file 1: Table SI for SNP genotyping data quality summary). 

Individual SNPs were tested for both allelic and geno- 
typic associations by calculating Fisher s exact ^-values 



Zhao et al. BMC Pregnancy and Childbirth 2012, 12:61 
http://www.biomedcentral.com/1471-2393/12/61 



Page 4 of 10 



and using a strict Bonferroni-corrected 0.05 genome- 
wide significance threshold of 7.1 x 10~ 8 (a = 0.05/ 
705,969). Presence of residual population stratification 
was assessed by performing a principal components ana- 
lysis using the EIGENSTRAT method in EIGENSOFT 
version 3.0 (http://genepath.med.harvard.edu/~reich/ 
Software.htm) [18]. 

CNV detection 

Four algorithms served to identify CNVs from the 
genome-wide SNP data. Three algorithms, Birdseye [19], 
PennCNV June 16, 2011 version [20], and QuantiSNP ver- 
sion 2.3 beta [21], implement a hidden Markov model that 
integrates multiple sources of information, including log R 
ratio (LRR; a measure of total signal intensity of probes) 
and B allele frequency (BAF; a measure of relative inten- 
sity ratio of allelic probes), to infer CNV calls for individ- 
ual genotyped samples. An example of LRR and BAF plots 
for a region called and confirmed as a deletion is shown in 
Figure 2. The last algorithm, Canary [19], utilizes a one- 
dimensional Gaussian mixture model to detect common 
CNVs. Birdseye and Canary were run as part of the Bird- 
suite version 1.5.5 toolset. The unified output of Birdseye 
and Canary was used for further analysis and these two 



algorithms are hereafter referred to as one algorithm, 
Birdsuite. Detection algorithms were run under default 
settings for Affymetrix SNP 6.0 microarray using LRR 
and/or BAF data from all SNP and copy-number probes 
in all 293 genotyped subjects. All samples were processed 
in one batch. The presence of CNVs on the X chromo- 
some is of interest as epigenetic investigation found a link 
between X chromosome inactivation and PE among white 
females [22]. Thus, both autosomal and X chromosomes 
were analyzed. 

A large number of algorithm calls may be indicative of 
low sample quality [23]. To assess sample quality, the 
right-skewed distribution of sample calls was first log- 
transformed for each algorithm. A sample was consid- 
ered to have failed QC and removed from analysis (n = 8 
cases and 2 controls) if it had an extremely large number 
of CNV calls, defined as having a value greater than 
three standard deviations from the mean of the log- 
transformed number of CNV calls per sample for at least 
one of the algorithms. 

A modified CNVision program [24] was used to 
merge, analyze, and annotate the outputs of Birdsuite, 
PennCNV, and QuantiSNP. The merge function of 
CNVision identifies and merges CNV calls made by all 
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Figure 2 Representative LRR and BAF plots for a genomic region called and assay confirmed as a deletion. LRR and BAF values for each 
probe are represented as dots. The vertical grey bars delineate the boundaries of the algorithm-detected deletion (chrl 3:83.004-83.045 Mb). LRR 
values for the SNP and copy-number probes in the deletion (red dots) drop to the -0.5 region and BAF values for the SNP probes cluster 
randomly around 0 or 1. In comparison, the flanking normal chromosomal regions have LRR values centered around zero with three BAF clusters 
(blue dots). 
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algorithms that have overlap of >1 base pair and deter- 
mines the percentage of call overlap by algorithm within 
this region. Merged CNV calls were excluded 
(n = 210,583) when at least one of the following condi- 
tions were met: 1) <50% overlap between two algo- 
rithms and <25% overlap among three algorithms; 2) 
less than ten consecutive SNP or copy-number probes; 
or 3) both deletion and amplification calls made by dif- 
ferent algorithms in the same genomic region in a given 
sample. 

CNV prioritization 

An objective prioritization strategy was employed to 
generate a list of candidate CNV regions most enriched 
among cases. Odds ratios (ORs) were calculated for this 
purpose. Therefore, CNVs with ORs > 2.5, comparing PE 
cases and normotensive controls, and CNVs called in >3 
cases and absent in controls were selected for additional 
consideration. Recurrent regions overlapping centro- 
meric and telomeric regions according to PennCNV 
definitions or containing less than five consecutive 
microarray probes were excluded. The normal expected 
frequency of these shortlisted CNVs was assessed in an 
independent comparison group of white female controls 
(n = 774) genotyped using Affymetrix SNP 6.0 microar- 
rays from a GWAS of schizophrenia (NCBI study acces- 
sion: phs000021.v3.p2) [25]. As the schizophrenia study 
controls were not screened with the same criteria as 
SOPHIA controls, it was expected that 2-7% of these 
women who become pregnant would develop PE. How- 
ever, assuming a positive relationship exists, this selec- 
tion bias may in fact attenuate the association, providing 
a conservative estimate of risk for the prioritization 
process. CNVs were detected in the same manner as the 
PE cases and controls; however, samples were batch- 
processed by 96-well plate. The same sample QC and 
merged CNV call criteria were applied. Regions were 
further considered when ORs comparing PE cases and 
schizophrenia study controls were positive with their 
95% confidence intervals excluding the null value (OR = 1) 
or when calls were absent among controls. 

As DNA quantities were very limited, it was decided a 
priori that only copy-number deletions would be 
selected for assay confirmation due to their generally 
more deleterious nature relative to amplifications [26]. 
Regions to be assayed in the entire case-control dataset 
using quantitative real-time PCR (RT-qPCR) were 
selected based on the presence of genes at or near 
(<100 kb) the CNV, the availability of DNA for samples 
displaying the CNV, and having the majority of samples 
without the deletion be copy normal. LRR and BAF plots 
were also visually inspected for a clear change in probe 
hybridization intensity and zygosity, respectively, to en- 
sure patterns consistent with calls. 



Copy-number genotyping 

Copy-number genotyping was performed using TaqMan 
Copy Number Assays (Applied Biosystems, Foster City, 
CA) following the manufacturers suggested protocol. All 
reactions were performed in triplicate with 5 ng of sam- 
ple DNA for each reaction. The copy-number assay 
detects the target genomic region and consists of a FAM 
dye-labeled minor groove binder probe and unlabeled 
PCR primers. This target assay was amplified simultan- 
eously with a genomic reference assay, which includes a 
VIC dye-labeled TAMRA probe and primers, in a duplex 
RT-qPCR. The reference assay detects the RNaseP gene, 
a known two copy region in the diploid genome. Plates 
were run using the Bio-Rad CFX384 machine (Bio-Rad 
Laboratories, Hercules, CA) under recommended PCR 
cycling conditions (95°C for 10 minutes followed by 40 
cycles of 95°C for 15 seconds and 60°C for 1 minute). 
Cycle thresholds (C T ) were calculated using CFX Man- 
ager Software version 2.0 (Bio-Rad) and reformatted 
with a manual C T threshold specification of 0.20 for im- 
port into CopyCaller Software version 1.0 (Applied Bio- 
systems). Wells with VIC C T values exceeding the 
default filtering threshold of 32 were excluded. Relative 
quantification analysis was performed with CopyCaller 
software where discrete copy-number classes were deter- 
mined by employing a maximum-likelihood algorithm 
on the real-time data. 

Results 

Genomic positions are designated according to NCBI36/ 
hgl8 human genome assembly. 

SNP associations 

A total of 292 of the 293 genotyped samples (177 cases 
and 115 controls) passed the default QC call rate thresh- 
old (>86%), with overall sample call rates ranging from 
86.1-99.3%. EIGENSTRAT analysis showed no evidence 
of population stratification (p > 0.15 for first 10 principal 
components). 

No SNP surpassed the Bonferroni-corrected significance 
threshold of 7.1 x 10" 8 for the Fisher s exact allelic or geno- 
typic tests. The top four SNP candidates had an allelic or 
genotypic ^-value between 10" 5 and 10" 6 (Table 1). 

CNV detection and confirmation 

A total of 14,181 autosomal CNVs, 9074 deletions and 
5107 amplifications meeting inclusion criteria, were 
detected among the 169 case and 114 control subjects 
that passed the sample QC. The identified variants ran- 
ged in size from 241 base pairs to nearly 4.1 Mb (median 
= 22.8 kb). The copy-number deletions were merged 
into 2770 regions, as defined by the minimum region of 
overlap across sample calls. Among these, three merged 
regions of recurrent deletions that were detected in 
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Table 1 Association (p-value) of the top four SNP candidates with PE a 



SNP fe 


Minor allele frequency (%) 


Fisher's exact p-value 


Chr. 


Position 


SNP type 
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Distance 
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Allelic 


Genotypic 








gene 


to gene (kb) 




(n = 177) 


(n = 115) 
















rs 1426409 


22.6 


31.1 


2.58 X10" 2 


3.14 X10" 6 
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36850339 


Intergenic 


KIAA1239 


72.7 


rs 17686866 


14.2 


29.1 


1.97X10" 5 


3.80 X10" 6 


1 


215251888 


Intronic 


ESRRG 


0 


rs9831647 


38.1 


48.3 


1 .65 x 1 0" 2 


9.36 X10" 6 


3 


8546110 


Intronic 


LMCD1 


0 


rs 10743565 


24.3 


42.8 


5.00 X10" 6 


1.64X10" 5 


12 


25551846 


Intronic 


IF LTD! 


0 



a SNPs with allelic or genotypic p-value between 10~ 5 and 10~ 6 . 

^Ordered according to smallest to largest allelic or genotypic Fisher's exact p-values. 



multiple cases, but were less common or undetected in 
controls, met the pre-specified screening criteria (Table 2; 
see Additional file 2: Table S2 for an annotated list of 
deleted regions in autosomal chromosomes meeting ini- 
tial prioritization criteria). The exact breakpoints for 
these three candidate CNVs remain undetermined, but 
microarray data suggest that these breakpoints may vary 
in subjects harboring the deletions. Correspondingly, 
copy- number amplifications were merged into 2981 
regions with 21 regions meeting initial prioritization cri- 
teria (Additional file 3: Table S3). For X chromosome, 
107 deletion and 114 amplification calls were included 
and merged into 97 regions of deletion and 97 regions 
of amplification. Only five merged regions of recurrent 
amplifications in X chromosome were found to be 
enriched in cases (Additional file 4: Table S4). These 
minimal common regions of amplification are interest- 
ing candidates for further investigation. 

The most enriched deletion among cases is -15 kb in 
length and was detected in eight cases (4.7%) and one 
control (0.9%). The shared region of overlap for this de- 
letion in 16pl3.11 extends from 14.972 to 14.987 Mb 
and encompasses the PDXDC1 gene. This deletion was 



Table 2 Recurrent copy-number deletions identified in PE 
cases and controls 



CNV region a 


Gene 


Copy-number frequency b 




contents 


Cases c Controls c 


Population- 
based 
controls d 


Chr1 3:83.004-83.045 Mb 


Intergenic 


5/169 0/114 


6/770 


Chr1 6:1 4.972-1 4.987 Mb 


PDXDC1 


8/169 1/114 


14/770 


Chr1 9:48.461 -48.476 Mb 


PSG11 


5/169 1/114 


2/770 



a Minimum region of overlap across all subjects harboring the deletion. 
fa CNV frequencies are derived from SNP microarray data. 
Algorithm detected CNVs were all validated by RT-qPCR with the exception of 
one case in each of the chromosome 13 and 16 regions due to unavailable 
DNA. Furthermore, RT-qPCR identified a heterozygous deletion in one case 
and two control subjects in the chromosome 16 CNV region as well as a case 
subject in the chromosome 19 region that were called as copy normal by the 
algorithms. 

d White female subjects from a GWAS of schizophrenia genotyped using 
Affymetrix SNP 6.0 microarrays. CNVs were detected using the same 
algorithms as the PE GWAS. Four population-based controls did not pass 
sample QC. 



identified in 14 of the 770 (1.8%) schizophrenia study 
controls that passed QC. The 41 kb intergenic deletion 
at 13q31.1 was detected in five cases (3.0%) and zero 
controls with a shared region from 83.004 to 83.045 Mb. 
This deletion was also identified in six of the 770 (0.8%) 
schizophrenia study controls. The nearest gene, 
SLITRK1, is located 304.17 kb downstream of this re- 
gion. The third CNV overlaps with the PSG11 gene (al- 
ternatively spliced as PSG9 and PSGlls) in 19ql3.31 
from 48.461 to 48.476 Mb. This deletion was detected in 
five cases (3.0%), one control (0.9%), and only two 
schizophrenia study controls (0.3%). It was also reported 
in very low frequencies by three studies [27-29] listed in 
the Database of Genomic Variants (http://projects.tcag. 
ca/variation/) [30]. One study reported this deletion in 
2/1854 (0.1%) controls, one in 1/776 (0.1%) controls, 
and the last in 11/2026 (0.5%) controls. 

All samples with available DNA were genotyped using 
pre-designed TaqMan assays for the deletions in 13q31.1 
(ABI assay ID: Hs03297694_cn) and 16pl3.11 
(Hs03938043_cn). The deletion in 19ql3.31 was geno- 
typed using a custom assay with a target region of 
chrl9:48,461,720-48,462,020 designed using the Copy 
Number Assay Workflow Builder (http://www5.applied- 
biosystems.com/tools/cnv/). Two cases failed to amplify 
across all three genomic regions while an additional case 
failed for the chromosome 19 region. These three sam- 
ples were algorithm called as copy normal in the three 
regions of interest. No samples were excluded for sur- 
passing the VIC C T value threshold. 

No false-positive and only a few false-negative algo- 
rithm calls were found by laboratory verification. The 
presence of all putative deletions called by the algo- 
rithms in these three genomic regions was successfully 
confirmed by RT-qPCR, except one case subject in each 
of the chromosome 13 and 16 CNV regions where DNA 
was unavailable. The RT-qPCR assay showed that a het- 
erozygous copy-number deletion was present in 4/155 
cases and none in 98 controls in the 13q31.1 region and 
8/155 cases and 3/98 controls in the 16pl3.11 region, in- 
cluding deletions detected in one case and two controls 
that were not algorithm called. For the deletion in 
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chromosome 19, a heterozygous deletion was confirmed 
in five cases and one control with an additional case de- 
letion detected that was called copy normal by the CNV 
calling algorithms. 

Discussion 

Genome-wide analysis of CNVs identified three rare dele- 
tions enriched in PE, two of which disrupt genes, con- 
firmed by laboratory validation. Although copy-number 
amplifications were not selected for assay, several candidate 
regions were detected in the autosomal and X chromo- 
somes. The most interesting deletion, based on possible 
biological pathways, is the 15 kb deletion in 19ql3.31 that 
encompasses the PSG11 gene. Pregnancy-specific glycopro- 
teins (PSGs) are mainly produced by placental syncytiotro- 
phoblasts during pregnancy and constitute a subgroup of 
the carcinoembryonic antigen family, which belongs 
to the immunoglobulin superfamily [31]. Studies 
have shown that several members of the PSG gene 
family, including PSG 11, induce dose-dependent monocy- 
tic secretion of anti-inflammatory cytokines, which 



physiologically contribute to the maintenance of a success- 
ful pregnancy. In contrast, activation of coagulation 
mechanisms by pro-inflammatory cytokines can lead to 
maternal endothelial dysfunction, vasculitis, and conse- 
quently uteroplacental hypoxia [32]. Hypoxia plays a crucial 
role in placental pathologies such as PE. Inadequate utero- 
placental oxygenation is believed to be involved in molecu- 
lar events leading to the clinical manifestations of PE [33]. 

PSG 11 is located at the telomeric end of the PSG gene 
family cluster (chrl9:47.918-48.465 Mb) and is 
arranged in tandem with the other PSG genes [34]. 
This cluster has a high density of segmental duplica- 
tions (low copy repeats). CNVs are not uniformly 
distributed in the human genome, but tend to be 
enriched in regions of segmental duplication. Seg- 
mental duplications predispose affected regions to 
recurrent chromosomal rearrangements through 
non-allelic homologous recombination and may be 
the underlying mechanism in the formation of CNVs 
within this cluster (see Figure 3 for locations of pu- 
tative segmental duplications within the PSG gene family 
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H hgl8 
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Figure 3 UCSC Genome Browser plot of the PSG gene family region (chrl 9:47.91 8-48.465 Mb). (A) Each red horizontal bar represents the 
length and breakpoints of a putative deletion called in PE cases or controls. (B) UCSC Genes located within this region. Asterisks indicate the 
genomic positions of nominally significant SNPs (from left to right: rs4030933, rs21 59027, rsl 041 731 9, and rs1 04021 73). (C) Segmental duplications 
of >1 kb with 90-98% sequence similarity. 
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cluster) [35,36]. According to the microarray data, there is 
a highly variable genomic region upstream of the deletion 
of interest in 19ql3.31 (Figure 4). It is apparent that the 3' 
boundary of this upstream CNV hotspot terminates before 
the alternatively spliced PSG11 region. However, the 
enriched minimum deleted region appears to be primarily 
an extension of this more common upstream deleted re- 
gion (5/6 deletions). Although the exact breakpoints remain 
undetermined, RT-qPCR confirmed the deletion break- 
points to be varying in this genomic locus; the minimal 
deleted region was not detected in every sample with the 
upstream deletion. There is also no evidence of appreciable 
enrichment in the minimum region of overlap (48.396- 
48.448 Mb) for upstream deletion, which was detected in 
29 cases (17.2%) and 14 controls (12.3%). Furthermore, in- 
spection of the entire PSG gene family region 
(chrl9:47.918-48.465 Mb) suggests that only the deletion in 
the 48.461 to 48.476 Mb region, which disrupts PSG11, is 
enriched in PE cases compared to controls (Figure 3). 

Studies have shown that CNVs can lead to diseases or 
other phenotypes by various mechanisms, including the 
disruption of functional genes [37]. Therefore, deletions 
are under strong purifying selection and are preferen- 
tially located outside of genes and highly conserved ele- 
ments in the genome [15]. The rarity of the deletion in 
the functionally relevant PSG 11 gene, within a hotspot 
of genomic instability, suggests that there is selective 
pressure acting against this CNV. Consequently, al- 
though PSG 11 and its alternatively spliced variants have 



yet to be directly linked to PE, they represent intriguing 
candidates for future research. 

The validity of CNVs identified in array-based studies 
is algorithm-dependent. High variability in findings 
exists among CNV detection methods, along with sub- 
stantial false-positive and false-negative rates [38]. In an 
attempt to increase the accuracy of CNV prediction, this 
study employed three algorithms (Birdsuite (Birdseye 
and Canary), PennCNV, and QuantiSNP) with stringent 
overlapping criteria to call CNVs on a genome-wide 
level. RT-qPCR confirmed 4/4 predicted deletions in 
13q31.1 with DNA available for assay. All calls in this re- 
gion were in 100% agreement by the three algorithms, 
with no false-positive or false-negative calls. For the 
deletions in 16pl3.11 and 19ql3.31, all calls in cases and 
controls with available DNA were confirmed by assay. 
However, there were false-negative calls in one case and 
two controls for the former and in one case subject for 
the latter CNV. In both of these deletions, the minimum 
region of overlap among cases and controls was called 
by at least two algorithms. Visual inspection of signal in- 
tensity plots for the assay-confirmed CNV calls sup- 
ported the heterozygous copy-number deletion call, 
where the LRR drops to the -0.5 region and the BAF 
clusters around 0 or 1. Although the plots revealed that 
LRR and BAF patterns were consistent with a deletion, 
the false-negative calls appear to be caused by a high 
signal-to-noise ratio in the array data at these regions. 
Despite the potential exclusion of candidate CNV regions 
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Figure 4 UCSC Genome Browser plot of the copy-number deletion at chrl 9:48.461 -48.476 Mb. (A) The vertical black lines indicate the 
minimum region of overlap across all subjects harboring the deletion. Each red horizontal bar represents the length and breakpoints of a 
deletion detected in either PE cases or controls. Exact CNV breakpoints are unknown. (B) UCSC Genes located within this region. The asterisk 
denotes the target region (chrl 9:48,461 ,720-48,462,020) of the custom TaqMan copy-number assay. (C) Genomic positions of SNP and structural 
variation copy-number probes used in the Asymetrix SNP 6.0 microarray. 
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due to the stringent CNV detection method applied in this 
study, it was successful in reducing the false-positive and 
false-negative rates that are common in studies that rely 
on array-based technologies to infer CNVs. 

The present GWAS did not identify any variants that 
were associated with PE at a genome-wide level of sig- 
nificance. The lack of significant findings may be due to 
the insufficient statistical power of this study to detect 
variants of small or moderate effect size, increasing the 
false- negative rate. Furthermore, forcing raw measure- 
ments with continuous distributions (i.e. probe in- 
tensity measurements) into discrete copy-number classes 
(e.g. gain, no change, loss) when calling CNVs using array- 
based technologies may result in the loss of substantial stat- 
istical power [39]. Discovery of disease-associated variants 
may be biased towards genomic regions with better cover- 
age by the SNP microarray used in the study. Study findings 
also assumed that genotype does not influence subject se- 
lection. Although a reasonable assumption, there may be 
situations where the genotype affects participation rates 
through its association with certain selection factors [40]. 

Although no SNPs reached genome-wide significance, 
it is interesting to note that four SNPs within the PSG 
gene family cluster (chrl9:47.918-48.465 Mb) and imme- 
diate flanking regions (±10 kb) reached nominal signifi- 
cance, including one in the intronic region of PSG1, 3, 4, 
and 8 genes, one in the intronic region of PSG2, 3, 6, 7, 
and 11 genes, and two located intergenically between 
PSG2 and PSG5 (see Figure 3 for the SNP positions 
within the gene cluster). These nominally significant 
SNPs may be functionally important; regulatory ele- 
ments, such as enhancers and repressors, may reside in 
intronic regions or up- and downstream of the transcrip- 
tional unit [41]. Further replication and functional stud- 
ies are warranted to elucidate the roles of these putative 
risk variants within the PSG gene family region in PE. 

Conclusions 

Genome-wide CNV analysis discovered three rare but 
recurrent deletions that may confer risk for PE, includ- 
ing a potentially functionally important copy-number 
deletion in the PSG 11 gene. Larger replication studies 
are needed to confirm these findings. Although no sig- 
nificant SNPs were discovered, the list of top SNP candi- 
dates generated by the present study may be a useful 
basis for future genetic association studies of PE. 
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